Speech event detection using multiband modulation energy
Abstract
Efficient, sophisticated features for speech event detection are needed in state-of-the-art speech processing, enhancement, and recognition systems. We explore ideas and techniques from nonlinear speech modeling and analysis, such as modulations and multiband filtering, and propose new energy and spectral-content features. These are derived by filtering in multiple frequency bands and tracking the dominant modulation energy, measured as the Teager-Kaiser energy of separate AM-FM components. We present a detection-theoretic motivation for the features and incorporate them in two detection schemes: word boundary detection and voice activity detection. The modulation approach improved endpoint detection accuracy in noisy speech, reaching ∼40% error reduction on NTIMIT. In a voice activity detection scheme, the improvement in overall misclassification error of a high hit-rate detector reached 7.5% on the Aurora 2 database and 9.5% on Aurora 3.
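The feature described in the abstract (filter in several bands, then track the band with the dominant Teager-Kaiser energy) can be illustrated with a minimal sketch. The abstract does not give the filterbank design or frame settings, so everything below is an assumption made for illustration: the linearly spaced Gabor filters, their bandwidth, the frame length and hop, and the function names `teager_kaiser`, `gabor_bandpass`, and `max_band_energy` are hypothetical and are not taken from the paper.

```python
import numpy as np

def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].
    The two edge samples are left at zero."""
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi

def gabor_bandpass(fc, fs, bandwidth):
    """Real Gabor impulse response: Gaussian envelope times a cosine at fc (Hz).
    Assumption: the Gaussian's frequency-domain std is bandwidth / 2."""
    sigma_t = 1.0 / (2.0 * np.pi * (bandwidth / 2.0))   # time-domain std (s)
    half_len = int(np.ceil(4.0 * sigma_t * fs))          # truncate at 4 sigma
    t = np.arange(-half_len, half_len + 1) / fs
    h = np.exp(-t ** 2 / (2.0 * sigma_t ** 2)) * np.cos(2.0 * np.pi * fc * t)
    # Normalize to unit gain at the center frequency.
    gain = np.abs(np.sum(h * np.exp(-2j * np.pi * fc * t)))
    return h / gain

def max_band_energy(x, fs, n_bands=6, frame_len=200, hop=100):
    """Per frame: the largest mean Teager-Kaiser energy over all bands,
    and the index of the band that attains it."""
    bandwidth = (fs / 2.0) / n_bands
    centers = np.linspace(bandwidth / 2.0, fs / 2.0 - bandwidth / 2.0, n_bands)
    starts = range(0, len(x) - frame_len + 1, hop)
    band_energy = []
    for fc in centers:
        y = np.convolve(x, gabor_bandpass(fc, fs, bandwidth), mode="same")
        psi = teager_kaiser(y)
        band_energy.append([psi[s:s + frame_len].mean() for s in starts])
    band_energy = np.array(band_energy)        # shape: (n_bands, n_frames)
    return band_energy.max(axis=0), band_energy.argmax(axis=0)

# Usage: half a second of weak noise followed by a 1 kHz tone in the same noise.
fs = 8000
rng = np.random.default_rng(0)
n = np.arange(fs)
x = 0.01 * rng.standard_normal(fs)
x[fs // 2:] += np.cos(2.0 * np.pi * 1000.0 * n[fs // 2:] / fs)
energy, dominant_band = max_band_energy(x, fs)
```

In this toy signal the frame energy rises sharply once the tone starts, so thresholding `energy` gives a crude event detector. A real word boundary or voice activity detector would also need an adaptive threshold and smoothing, which this sketch leaves out.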
Similar resources
Speech formant frequency and bandwidth tracking using multiband energy demodulation
In this paper, the amplitude and frequency AM–FM modulation model and a multiband demodulation analysis scheme are applied to formant frequency and bandwidth tracking of speech signals. Filtering by a bank of Gabor bandpass filters is performed to isolate each speech resonance in the signal. Next, the amplitude envelope AM and instantaneous frequency FM are estimated for each band using the ene...
Multiband, multisensor robust features for noisy speech recognition
This paper presents a novel feature extraction scheme taking advantage of both the nonlinear modulation speech model and the spatial diversity of speech and noise signals in a multisensor environment. Herein, we propose applying robust features to speech signals captured by a multisensor array minimizing a noise energy criterion over multiple frequency bands. We show that we can achieve improve...
Speech analysis and synthesis using an AM-FM modulation model
In this paper, the AM-FM modulation model is applied to speech analysis, synthesis and coding. The multiband demodulation pitch tracking algorithm is proposed that produces smooth and accurate fundamental frequency contours. The AM-FM modulation vocoder represents speech as the sum of resonance signals modeled by their amplitude envelope and instantaneous frequency signals. Efficient modeling an...
Combating nonlinear telephone channel-noise using the multiband AM-FM model
This study presents a novel technique to enhance telephone speech signals. This technique is based on the Amplitude and Frequency Modulation (AM-FM) model, which represents the speech signal as the sum of N successive AM-FM signals. Based on a least-mean-square error criterion, each AM-FM signal is modified using an iterative algorithm in order to compensate for the deformation of the signal cau...